Overview

Dataset statistics

Number of variables: 26
Number of observations: 8161
Missing cells: 2405
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 MiB
Average record size in memory: 976.7 B
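
The missing-cell percentage can be checked from the counts reported above. The sketch below assumes the percentage is taken over all cells (variables times observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 26
n_observations = 8161
missing_cells = 2405

# Assumption: the denominator is the total number of cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption the result rounds to the 1.1% shown in the report.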

Variable types

Numeric: 10
Categorical: 13
Boolean: 3

Alerts

INCOME has a high cardinality: 6612 distinct values [High cardinality]
HOME_VAL has a high cardinality: 5106 distinct values [High cardinality]
BLUEBOOK has a high cardinality: 2789 distinct values [High cardinality]
OLDCLAIM has a high cardinality: 2857 distinct values [High cardinality]
AGE is highly overall correlated with HOMEKIDS [High correlation]
HOMEKIDS is highly overall correlated with AGE and 1 other field [High correlation]
PARENT1 is highly overall correlated with HOMEKIDS [High correlation]
SEX is highly overall correlated with CAR_TYPE and 1 other field [High correlation]
EDUCATION is highly overall correlated with JOB [High correlation]
JOB is highly overall correlated with EDUCATION and 1 other field [High correlation]
CAR_USE is highly overall correlated with JOB and 1 other field [High correlation]
CAR_TYPE is highly overall correlated with SEX and 1 other field [High correlation]
RED_CAR is highly overall correlated with SEX [High correlation]
KIDSDRIV is highly imbalanced (70.9%) [Imbalance]
OLDCLAIM is highly imbalanced (53.2%) [Imbalance]
YOJ has 454 (5.6%) missing values [Missing]
INCOME has 445 (5.5%) missing values [Missing]
HOME_VAL has 464 (5.7%) missing values [Missing]
JOB has 526 (6.4%) missing values [Missing]
CAR_AGE has 510 (6.2%) missing values [Missing]
INDEX is uniformly distributed [Uniform]
INDEX has unique values [Unique]
TARGET_AMT has 6008 (73.6%) zeros [Zeros]
HOMEKIDS has 5289 (64.8%) zeros [Zeros]
YOJ has 625 (7.7%) zeros [Zeros]
CLM_FREQ has 5009 (61.4%) zeros [Zeros]
MVR_PTS has 3712 (45.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-02-17 18:02:00.097790
Analysis finished: 2023-02-17 18:02:33.695571
Duration: 33.6 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
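
The reported duration follows from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-02-17 18:02:00.097790")
finished = datetime.fromisoformat("2023-02-17 18:02:33.695571")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```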

Variables

INDEX
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 8161
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5151.8677
Minimum: 1
Maximum: 10302
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 509
Q1: 2559
Median: 5133
Q3: 7745
95-th percentile: 9791
Maximum: 10302
Range: 10301
Interquartile range (IQR): 5186

Descriptive statistics

Standard deviation: 2978.894
Coefficient of variation (CV): 0.57821632
Kurtosis: -1.2029827
Mean: 5151.8677
Median Absolute Deviation (MAD): 2591
Skewness: 0.0020046137
Sum: 42044392
Variance: 8873809.2
Monotonicity: Strictly increasing
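
Several of the INDEX figures are derived from others in the same block. The sketch below recomputes them from the reported values, assuming the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Small differences from the report are expected because the inputs are already rounded.

```python
# Values copied from the INDEX statistics.
minimum, maximum = 1, 10302
q1, q3 = 2559, 7745
mean = 5151.8677
std = 2978.894

value_range = maximum - minimum   # report lists Range as 10301
iqr = q3 - q1                     # report lists IQR as 5186
cv = std / mean                   # report lists CV as 0.57821632
variance = std ** 2               # report lists Variance as 8873809.2

print(value_range, iqr, round(cv, 6), round(variance, 1))
```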
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
< 0.1%
6874 1
 
< 0.1%
6890 1
 
< 0.1%
6889 1
 
< 0.1%
6888 1
 
< 0.1%
6887 1
 
< 0.1%
6886 1
 
< 0.1%
6884 1
 
< 0.1%
6883 1
 
< 0.1%
6882 1
 
< 0.1%
Other values (8151) 8151
99.9%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
11 1
< 0.1%
12 1
< 0.1%
13 1
< 0.1%
ValueCountFrequency (%)
10302 1
< 0.1%
10301 1
< 0.1%
10299 1
< 0.1%
10298 1
< 0.1%
10297 1
< 0.1%
10296 1
< 0.1%
10295 1
< 0.1%
10293 1
< 0.1%
10292 1
< 0.1%
10291 1
< 0.1%

TARGET_FLAG
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 462.4 KiB
0: 6008
1: 2153

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8161
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      6008   73.6%
1      2153   26.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 6008
73.6%
1 2153
 
26.4%

Most occurring characters

ValueCountFrequency (%)
0 6008
73.6%
1 2153
 
26.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8161
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6008
73.6%
1 2153
 
26.4%

Most occurring scripts

ValueCountFrequency (%)
Common 8161
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6008
73.6%
1 2153
 
26.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8161
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6008
73.6%
1 2153
 
26.4%

TARGET_AMT
Real number (ℝ)

Distinct: 1949
Distinct (%): 23.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1504.3246
Minimum: 0
Maximum: 107586.14
Zeros: 6008
Zeros (%): 73.6%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1036
95-th percentile: 6452
Maximum: 107586.14
Range: 107586.14
Interquartile range (IQR): 1036

Descriptive statistics

Standard deviation: 4704.0269
Coefficient of variation (CV): 3.1270025
Kurtosis: 112.38628
Mean: 1504.3246
Median Absolute Deviation (MAD): 0
Skewness: 8.7095047
Sum: 12276793
Variance: 22127869
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 6008
73.6%
2327 4
 
< 0.1%
3350 3
 
< 0.1%
980 3
 
< 0.1%
3667 3
 
< 0.1%
2489 3
 
< 0.1%
2493 3
 
< 0.1%
5453 3
 
< 0.1%
3501 3
 
< 0.1%
5728 3
 
< 0.1%
Other values (1939) 2125
 
26.0%
ValueCountFrequency (%)
0 6008
73.6%
30.27728015 1
 
< 0.1%
58.53106231 1
 
< 0.1%
95.56731717 1
 
< 0.1%
108.7414986 1
 
< 0.1%
159.1509202 1
 
< 0.1%
196.1468185 1
 
< 0.1%
223.6120015 1
 
< 0.1%
262.0385439 1
 
< 0.1%
291.7285708 1
 
< 0.1%
ValueCountFrequency (%)
107586.1362 1
< 0.1%
85523.65335 1
< 0.1%
78874.19056 1
< 0.1%
77907.43028 1
< 0.1%
73783.46592 1
< 0.1%
64181.71033 1
< 0.1%
60846.53042 1
< 0.1%
60838.10394 1
< 0.1%
58851.06776 1
< 0.1%
56399.75387 1
< 0.1%

KIDSDRIV
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 462.4 KiB
0: 7180
1: 636
2: 279
3: 62
4: 4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8161
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      7180   88.0%
1      636    7.8%
2      279    3.4%
3      62     0.8%
4      4      < 0.1%
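
The Alerts section flags KIDSDRIV as highly imbalanced (70.9%), but the report does not say how that score is computed. One plausible definition, assumed here, is one minus the normalized Shannon entropy of the class counts. The function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    This definition is an assumption, not something stated in the report.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)


# Counts copied from the KIDSDRIV "Common Values" table (values 0 through 4).
kidsdriv_counts = [7180, 636, 279, 62, 4]
score = imbalance_score(kidsdriv_counts)
print(f"KIDSDRIV imbalance: {score:.1%}")
```

With these counts the assumed formula gives a value close to the 70.9% in the alert, which suggests, but does not confirm, that the report uses a similar measure.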

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 7180
88.0%
1 636
 
7.8%
2 279
 
3.4%
3 62
 
0.8%
4 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 7180
88.0%
1 636
 
7.8%
2 279
 
3.4%
3 62
 
0.8%
4 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8161
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7180
88.0%
1 636
 
7.8%
2 279
 
3.4%
3 62
 
0.8%
4 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 8161
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7180
88.0%
1 636
 
7.8%
2 279
 
3.4%
3 62
 
0.8%
4 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8161
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7180
88.0%
1 636
 
7.8%
2 279
 
3.4%
3 62
 
0.8%
4 4
 
< 0.1%

AGE
Real number (ℝ)

Distinct: 60
Distinct (%): 0.7%
Missing: 6
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.790313
Minimum: 16
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 16
5-th percentile: 30
Q1: 39
Median: 45
Q3: 51
95-th percentile: 59
Maximum: 81
Range: 65
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.6275895
Coefficient of variation (CV): 0.19262177
Kurtosis: -0.060282517
Mean: 44.790313
Median Absolute Deviation (MAD): 6
Skewness: -0.028999616
Sum: 365265
Variance: 74.4353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
46 401
 
4.9%
45 376
 
4.6%
48 363
 
4.4%
47 355
 
4.3%
43 351
 
4.3%
41 336
 
4.1%
44 336
 
4.1%
42 333
 
4.1%
50 329
 
4.0%
40 317
 
3.9%
Other values (50) 4658
57.1%
ValueCountFrequency (%)
16 5
 
0.1%
17 1
 
< 0.1%
18 3
 
< 0.1%
19 5
 
0.1%
20 3
 
< 0.1%
21 11
0.1%
22 14
0.2%
23 9
 
0.1%
24 21
0.3%
25 24
0.3%
ValueCountFrequency (%)
81 1
 
< 0.1%
80 1
 
< 0.1%
76 1
 
< 0.1%
73 3
 
< 0.1%
72 3
 
< 0.1%
70 6
 
0.1%
69 4
 
< 0.1%
68 8
 
0.1%
67 12
0.1%
66 24
0.3%

HOMEKIDS
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.72123514
Minimum: 0
Maximum: 5
Zeros: 5289
Zeros (%): 64.8%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1163233
Coefficient of variation (CV): 1.5477938
Kurtosis: 0.65101978
Mean: 0.72123514
Median Absolute Deviation (MAD): 0
Skewness: 1.3416202
Sum: 5886
Variance: 1.2461777
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      5289   64.8%
2      1118   13.7%
1      902    11.1%
3      674    8.3%
4      164    2.0%
5      14     0.2%

Value  Count  Frequency (%)
0      5289   64.8%
1      902    11.1%
2      1118   13.7%
3      674    8.3%
4      164    2.0%
5      14     0.2%

Value  Count  Frequency (%)
5      14     0.2%
4      164    2.0%
3      674    8.3%
2      1118   13.7%
1      902    11.1%
0      5289   64.8%
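
Because HOMEKIDS has only six distinct values and no missing entries, its summary statistics can be rebuilt from the value counts alone. The sketch below does that with Python's standard `statistics` module. It assumes the reported variance uses the sample (n - 1) denominator; the variable names are illustrative.

```python
import statistics

# Value -> count, copied from the HOMEKIDS value-count tables.
homekids_counts = {0: 5289, 1: 902, 2: 1118, 3: 674, 4: 164, 5: 14}

# Expand the table back into one value per observation.
values = [v for v, n in homekids_counts.items() for _ in range(n)]

n_obs = len(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
zeros_pct = 100 * homekids_counts[0] / n_obs

print(n_obs, round(mean, 8), median, round(variance, 7), round(zeros_pct, 1))
```

The recomputed mean, median, variance, and zero share agree with the figures reported for HOMEKIDS to the precision shown.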

YOJ
Real number (ℝ)

MISSING  ZEROS 

Distinct: 21
Distinct (%): 0.3%
Missing: 454
Missing (%): 5.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.499286
Minimum: 0
Maximum: 23
Zeros: 625
Zeros (%): 7.7%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
Median: 11
Q3: 13
95-th percentile: 15
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.0924742
Coefficient of variation (CV): 0.38978594
Kurtosis: 1.179969
Mean: 10.499286
Median Absolute Deviation (MAD): 2
Skewness: -1.203436
Sum: 80918
Variance: 16.748345
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
12 1158
14.2%
13 1016
12.4%
11 1003
12.3%
14 785
9.6%
10 749
9.2%
0 625
7.7%
9 521
6.4%
15 463
 
5.7%
8 384
 
4.7%
7 300
 
3.7%
Other values (11) 703
8.6%
(Missing) 454
 
5.6%
ValueCountFrequency (%)
0 625
7.7%
1 6
 
0.1%
2 15
 
0.2%
3 36
 
0.4%
4 37
 
0.5%
5 92
 
1.1%
6 173
 
2.1%
7 300
3.7%
8 384
4.7%
9 521
6.4%
ValueCountFrequency (%)
23 2
 
< 0.1%
19 12
 
0.1%
18 25
 
0.3%
17 101
 
1.2%
16 204
 
2.5%
15 463
 
5.7%
14 785
9.6%
13 1016
12.4%
12 1158
14.2%
11 1003
12.3%

INCOME
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 6612
Distinct (%): 85.7%
Missing: 445
Missing (%): 5.5%
Memory size: 494.2 KiB
$0: 615
$61,790: 4
$26,840: 4
$48,509: 4
$158: 3
Other values (6607): 7086

Length

Max length: 8
Median length: 7
Mean length: 6.7226542
Min length: 2

Characters and Unicode

Total characters: 51872
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6156
Unique (%): 79.8%

Sample

1st row: $67,349
2nd row: $91,449
3rd row: $16,039
4th row: $114,986
5th row: $125,301

Common Values

Value                Count  Frequency (%)
$0                   615    7.5%
$61,790              4      < 0.1%
$26,840              4      < 0.1%
$48,509              4      < 0.1%
$158                 3      < 0.1%
$65,885              3      < 0.1%
$48,741              3      < 0.1%
$47,513              3      < 0.1%
$23,157              3      < 0.1%
$20,887              3      < 0.1%
Other values (6602)  7071   86.6%
(Missing)            445    5.5%
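
INCOME is typed as Categorical and flagged for high cardinality because its values are stored as strings such as "$67,349" rather than as numbers. The sketch below shows one way to convert such strings to integers before profiling. It assumes every non-missing value is a dollar sign followed by digits and thousands separators, as in the samples above; `parse_currency` is a hypothetical helper, not part of pandas-profiling.

```python
from typing import Optional


def parse_currency(text: Optional[str]) -> Optional[int]:
    """Convert a string such as "$67,349" to the integer 67349.

    Missing entries (None or an empty string) are returned as None.
    """
    if text is None or text.strip() == "":
        return None
    cleaned = text.strip().replace("$", "").replace(",", "")
    return int(cleaned)


# Sample rows and the most common value from the INCOME section, plus a missing entry.
samples = ["$67,349", "$91,449", "$16,039", "$114,986", "$125,301", "$0", None]
parsed = [parse_currency(s) for s in samples]
print(parsed)
```

The same conversion would presumably apply to HOME_VAL, BLUEBOOK, and OLDCLAIM, which the report also lists as high-cardinality columns.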

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 615
 
8.0%
26,840 4
 
0.1%
48,509 4
 
0.1%
61,790 4
 
0.1%
64,032 3
 
< 0.1%
31,407 3
 
< 0.1%
54,691 3
 
< 0.1%
143,073 3
 
< 0.1%
22,362 3
 
< 0.1%
183,296 3
 
< 0.1%
Other values (6602) 7071
91.6%

Most occurring characters

ValueCountFrequency (%)
$ 7716
14.9%
, 7053
13.6%
1 4734
9.1%
2 3897
7.5%
3 3810
7.3%
0 3800
7.3%
4 3682
7.1%
5 3667
7.1%
6 3613
7.0%
7 3402
6.6%
Other values (2) 6498
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 37103
71.5%
Currency Symbol 7716
 
14.9%
Other Punctuation 7053
 
13.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4734
12.8%
2 3897
10.5%
3 3810
10.3%
0 3800
10.2%
4 3682
9.9%
5 3667
9.9%
6 3613
9.7%
7 3402
9.2%
9 3262
8.8%
8 3236
8.7%
Currency Symbol
ValueCountFrequency (%)
$ 7716
100.0%
Other Punctuation
ValueCountFrequency (%)
, 7053
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51872
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 7716
14.9%
, 7053
13.6%
1 4734
9.1%
2 3897
7.5%
3 3810
7.3%
0 3800
7.3%
4 3682
7.1%
5 3667
7.1%
6 3613
7.0%
7 3402
6.6%
Other values (2) 6498
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51872
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 7716
14.9%
, 7053
13.6%
1 4734
9.1%
2 3897
7.5%
3 3810
7.3%
0 3800
7.3%
4 3682
7.1%
5 3667
7.1%
6 3613
7.0%
7 3402
6.6%
Other values (2) 6498
12.5%

PARENT1
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB
False: 7084
True: 1077
Value  Count  Frequency (%)
False  7084   86.8%
True   1077   13.2%

HOME_VAL
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 5106
Distinct (%): 66.3%
Missing: 464
Missing (%): 5.7%
Memory size: 489.4 KiB
$0: 2294
$115,249: 3
$288,592: 3
$173,130: 3
$153,061: 3
Other values (5101): 5391

Length

Max length: 8
Median length: 8
Mean length: 6.1635702
Min length: 2

Characters and Unicode

Total characters: 47441
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4819
Unique (%): 62.6%

Sample

1st row: $0
2nd row: $257,252
3rd row: $124,191
4th row: $306,251
5th row: $243,925

Common Values

ValueCountFrequency (%)
$0 2294
28.1%
$115,249 3
 
< 0.1%
$288,592 3
 
< 0.1%
$173,130 3
 
< 0.1%
$153,061 3
 
< 0.1%
$159,568 3
 
< 0.1%
$123,109 3
 
< 0.1%
$332,673 3
 
< 0.1%
$238,724 3
 
< 0.1%
$111,129 3
 
< 0.1%
Other values (5096) 5376
65.9%
(Missing) 464
 
5.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 2294
29.8%
332,673 3
 
< 0.1%
115,249 3
 
< 0.1%
166,481 3
 
< 0.1%
319,400 3
 
< 0.1%
111,129 3
 
< 0.1%
238,724 3
 
< 0.1%
196,320 3
 
< 0.1%
123,109 3
 
< 0.1%
159,568 3
 
< 0.1%
Other values (5096) 5376
69.8%

Most occurring characters

ValueCountFrequency (%)
$ 7697
16.2%
, 5403
11.4%
0 5053
10.7%
1 4855
10.2%
2 4668
9.8%
3 3302
7.0%
4 2807
 
5.9%
5 2787
 
5.9%
6 2727
 
5.7%
8 2717
 
5.7%
Other values (2) 5425
11.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34341
72.4%
Currency Symbol 7697
 
16.2%
Other Punctuation 5403
 
11.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5053
14.7%
1 4855
14.1%
2 4668
13.6%
3 3302
9.6%
4 2807
8.2%
5 2787
8.1%
6 2727
7.9%
8 2717
7.9%
7 2717
7.9%
9 2708
7.9%
Currency Symbol
ValueCountFrequency (%)
$ 7697
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5403
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 47441
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 7697
16.2%
, 5403
11.4%
0 5053
10.7%
1 4855
10.2%
2 4668
9.8%
3 3302
7.0%
4 2807
 
5.9%
5 2787
 
5.9%
6 2727
 
5.7%
8 2717
 
5.7%
Other values (2) 5425
11.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47441
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 7697
16.2%
, 5403
11.4%
0 5053
10.7%
1 4855
10.2%
2 4668
9.8%
3 3302
7.0%
4 2807
 
5.9%
5 2787
 
5.9%
6 2727
 
5.7%
8 2717
 
5.7%
Other values (2) 5425
11.4%

MSTATUS
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 481.5 KiB
Yes: 4894
z_No: 3267

Length

Max length: 4
Median length: 3
Mean length: 3.4003186
Min length: 3

Characters and Unicode

Total characters: 27750
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: z_No
2nd row: z_No
3rd row: Yes
4th row: Yes
5th row: Yes

Common Values

Value  Count  Frequency (%)
Yes    4894   60.0%
z_No   3267   40.0%
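
Several category labels in this report carry a "z_" prefix (z_No here, and z_F, z_High School, z_Blue Collar, z_SUV in other variables). The report does not explain the prefix. Assuming it is a labelling artifact with no meaning of its own, a small helper can strip it before analysis; `strip_z_prefix` is a hypothetical name used only for this sketch.

```python
def strip_z_prefix(label: str) -> str:
    """Remove a leading "z_" from a category label, if present."""
    return label[2:] if label.startswith("z_") else label


# Labels copied from the categorical variables in this report.
raw_labels = ["Yes", "z_No", "z_F", "M", "z_High School", "z_Blue Collar", "z_SUV"]
clean_labels = [strip_z_prefix(label) for label in raw_labels]
print(clean_labels)
```

If the prefix turns out to encode something (for example an intended sort order), the raw labels should be kept alongside the cleaned ones.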

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
yes 4894
60.0%
z_no 3267
40.0%

Most occurring characters

ValueCountFrequency (%)
Y 4894
17.6%
e 4894
17.6%
s 4894
17.6%
z 3267
11.8%
_ 3267
11.8%
N 3267
11.8%
o 3267
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16322
58.8%
Uppercase Letter 8161
29.4%
Connector Punctuation 3267
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4894
30.0%
s 4894
30.0%
z 3267
20.0%
o 3267
20.0%
Uppercase Letter
ValueCountFrequency (%)
Y 4894
60.0%
N 3267
40.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3267
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24483
88.2%
Common 3267
 
11.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 4894
20.0%
e 4894
20.0%
s 4894
20.0%
z 3267
13.3%
N 3267
13.3%
o 3267
13.3%
Common
ValueCountFrequency (%)
_ 3267
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 4894
17.6%
e 4894
17.6%
s 4894
17.6%
z 3267
11.8%
_ 3267
11.8%
N 3267
11.8%
o 3267
11.8%

SEX
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 470.9 KiB
z_F: 4375
M: 3786

Length

Max length: 3
Median length: 3
Mean length: 2.0721725
Min length: 1

Characters and Unicode

Total characters: 16911
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: z_F
4th row: M
5th row: z_F

Common Values

ValueCountFrequency (%)
z_F 4375
53.6%
M 3786
46.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
z_f 4375
53.6%
m 3786
46.4%

Most occurring characters

ValueCountFrequency (%)
z 4375
25.9%
_ 4375
25.9%
F 4375
25.9%
M 3786
22.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8161
48.3%
Lowercase Letter 4375
25.9%
Connector Punctuation 4375
25.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 4375
53.6%
M 3786
46.4%
Lowercase Letter
ValueCountFrequency (%)
z 4375
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4375
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12536
74.1%
Common 4375
 
25.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
z 4375
34.9%
F 4375
34.9%
M 3786
30.2%
Common
ValueCountFrequency (%)
_ 4375
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16911
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
z 4375
25.9%
_ 4375
25.9%
F 4375
25.9%
M 3786
22.4%

EDUCATION
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 531.2 KiB
z_High School: 2330
Bachelors: 2242
Masters: 1658
<High School: 1203
PhD: 728

Length

Max length: 13
Median length: 12
Mean length: 9.6426908
Min length: 3

Characters and Unicode

Total characters: 78694
Distinct characters: 21
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PhD
2nd row: z_High School
3rd row: z_High School
4th row: <High School
5th row: PhD

Common Values

ValueCountFrequency (%)
z_High School 2330
28.6%
Bachelors 2242
27.5%
Masters 1658
20.3%
<High School 1203
14.7%
PhD 728
 
8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
school 3533
30.2%
z_high 2330
19.9%
bachelors 2242
19.2%
masters 1658
14.2%
high 1203
 
10.3%
phd 728
 
6.2%

Most occurring characters

ValueCountFrequency (%)
h 10036
12.8%
o 9308
 
11.8%
l 5775
 
7.3%
c 5775
 
7.3%
s 5558
 
7.1%
e 3900
 
5.0%
r 3900
 
5.0%
a 3900
 
5.0%
H 3533
 
4.5%
i 3533
 
4.5%
Other values (11) 23476
29.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 59206
75.2%
Uppercase Letter 12422
 
15.8%
Space Separator 3533
 
4.5%
Connector Punctuation 2330
 
3.0%
Math Symbol 1203
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h 10036
17.0%
o 9308
15.7%
l 5775
9.8%
c 5775
9.8%
s 5558
9.4%
e 3900
 
6.6%
r 3900
 
6.6%
a 3900
 
6.6%
i 3533
 
6.0%
g 3533
 
6.0%
Other values (2) 3988
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
H 3533
28.4%
S 3533
28.4%
B 2242
18.0%
M 1658
13.3%
P 728
 
5.9%
D 728
 
5.9%
Space Separator
ValueCountFrequency (%)
3533
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2330
100.0%
Math Symbol
ValueCountFrequency (%)
< 1203
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 71628
91.0%
Common 7066
 
9.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
h 10036
14.0%
o 9308
13.0%
l 5775
 
8.1%
c 5775
 
8.1%
s 5558
 
7.8%
e 3900
 
5.4%
r 3900
 
5.4%
a 3900
 
5.4%
H 3533
 
4.9%
i 3533
 
4.9%
Other values (8) 16410
22.9%
Common
ValueCountFrequency (%)
3533
50.0%
_ 2330
33.0%
< 1203
 
17.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
h 10036
12.8%
o 9308
 
11.8%
l 5775
 
7.3%
c 5775
 
7.3%
s 5558
 
7.1%
e 3900
 
5.0%
r 3900
 
5.0%
a 3900
 
5.0%
H 3533
 
4.5%
i 3533
 
4.5%
Other values (11) 23476
29.8%

JOB
Categorical

HIGH CORRELATION  MISSING 

Distinct: 8
Distinct (%): 0.1%
Missing: 526
Missing (%): 6.4%
Memory size: 512.0 KiB
z_Blue Collar: 1825
Clerical: 1271
Professional: 1117
Manager: 988
Lawyer: 835
Other values (3): 1599

Length

Max length: 13
Median length: 10
Mean length: 9.4424361
Min length: 6

Characters and Unicode

Total characters: 72093
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Professional
2nd row: z_Blue Collar
3rd row: Clerical
4th row: z_Blue Collar
5th row: Doctor

Common Values

ValueCountFrequency (%)
z_Blue Collar 1825
22.4%
Clerical 1271
15.6%
Professional 1117
13.7%
Manager 988
12.1%
Lawyer 835
10.2%
Student 712
 
8.7%
Home Maker 641
 
7.9%
Doctor 246
 
3.0%
(Missing) 526
 
6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
z_blue 1825
18.1%
collar 1825
18.1%
clerical 1271
12.6%
professional 1117
11.1%
manager 988
9.8%
lawyer 835
8.3%
student 712
 
7.0%
home 641
 
6.3%
maker 641
 
6.3%
doctor 246
 
2.4%

Most occurring characters

ValueCountFrequency (%)
l 9134
12.7%
e 8030
 
11.1%
a 7665
 
10.6%
r 6923
 
9.6%
o 5192
 
7.2%
C 3096
 
4.3%
n 2817
 
3.9%
u 2537
 
3.5%
2466
 
3.4%
i 2388
 
3.3%
Other values (19) 21845
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57701
80.0%
Uppercase Letter 10101
 
14.0%
Space Separator 2466
 
3.4%
Connector Punctuation 1825
 
2.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 9134
15.8%
e 8030
13.9%
a 7665
13.3%
r 6923
12.0%
o 5192
9.0%
n 2817
 
4.9%
u 2537
 
4.4%
i 2388
 
4.1%
s 2234
 
3.9%
z 1825
 
3.2%
Other values (9) 8956
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 3096
30.7%
B 1825
18.1%
M 1629
16.1%
P 1117
 
11.1%
L 835
 
8.3%
S 712
 
7.0%
H 641
 
6.3%
D 246
 
2.4%
Space Separator
ValueCountFrequency (%)
2466
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1825
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 67802
94.0%
Common 4291
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 9134
13.5%
e 8030
11.8%
a 7665
11.3%
r 6923
 
10.2%
o 5192
 
7.7%
C 3096
 
4.6%
n 2817
 
4.2%
u 2537
 
3.7%
i 2388
 
3.5%
s 2234
 
3.3%
Other values (17) 17786
26.2%
Common
ValueCountFrequency (%)
2466
57.5%
_ 1825
42.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72093
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 9134
12.7%
e 8030
 
11.1%
a 7665
 
10.6%
r 6923
 
9.6%
o 5192
 
7.2%
C 3096
 
4.3%
n 2817
 
3.9%
u 2537
 
3.5%
2466
 
3.4%
i 2388
 
3.3%
Other values (19) 21845
30.3%

TRAVTIME
Real number (ℝ)

Distinct: 97
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.485725
Minimum: 5
Maximum: 142
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 5
5-th percentile: 7
Q1: 22
Median: 33
Q3: 44
95-th percentile: 60
Maximum: 142
Range: 137
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 15.908333
Coefficient of variation (CV): 0.47507807
Kurtosis: 0.66637462
Mean: 33.485725
Median Absolute Deviation (MAD): 11
Skewness: 0.44698169
Sum: 273277
Variance: 253.07507
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5 334
 
4.1%
35 219
 
2.7%
30 219
 
2.7%
32 214
 
2.6%
25 214
 
2.6%
36 211
 
2.6%
29 207
 
2.5%
33 206
 
2.5%
24 204
 
2.5%
37 202
 
2.5%
Other values (87) 5931
72.7%
ValueCountFrequency (%)
5 334
4.1%
6 49
 
0.6%
7 43
 
0.5%
8 54
 
0.7%
9 70
 
0.9%
10 87
 
1.1%
11 71
 
0.9%
12 97
 
1.2%
13 97
 
1.2%
14 102
 
1.2%
ValueCountFrequency (%)
142 1
< 0.1%
134 1
< 0.1%
124 1
< 0.1%
113 1
< 0.1%
103 1
< 0.1%
101 1
< 0.1%
98 1
< 0.1%
97 2
< 0.1%
95 2
< 0.1%
93 1
< 0.1%

CAR_USE
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 519.1 KiB
Private: 5132
Commercial: 3029

Length

Max length: 10
Median length: 7
Mean length: 8.1134665
Min length: 7

Characters and Unicode

Total characters: 66214
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private
2nd row: Commercial
3rd row: Private
4th row: Private
5th row: Private

Common Values

ValueCountFrequency (%)
Private 5132
62.9%
Commercial 3029
37.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
private 5132
62.9%
commercial 3029
37.1%

Most occurring characters

ValueCountFrequency (%)
r 8161
12.3%
i 8161
12.3%
a 8161
12.3%
e 8161
12.3%
m 6058
9.1%
P 5132
7.8%
v 5132
7.8%
t 5132
7.8%
C 3029
 
4.6%
o 3029
 
4.6%
Other values (2) 6058
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 58053
87.7%
Uppercase Letter 8161
 
12.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 8161
14.1%
i 8161
14.1%
a 8161
14.1%
e 8161
14.1%
m 6058
10.4%
v 5132
8.8%
t 5132
8.8%
o 3029
 
5.2%
c 3029
 
5.2%
l 3029
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
P 5132
62.9%
C 3029
37.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 66214
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 8161
12.3%
i 8161
12.3%
a 8161
12.3%
e 8161
12.3%
m 6058
9.1%
P 5132
7.8%
v 5132
7.8%
t 5132
7.8%
C 3029
 
4.6%
o 3029
 
4.6%
Other values (2) 6058
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 66214
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 8161
12.3%
i 8161
12.3%
a 8161
12.3%
e 8161
12.3%
m 6058
9.1%
P 5132
7.8%
v 5132
7.8%
t 5132
7.8%
C 3029
 
4.6%
o 3029
 
4.6%
Other values (2) 6058
9.1%

BLUEBOOK
Categorical

Distinct: 2789
Distinct (%): 34.2%
Missing: 0
Missing (%): 0.0%
Memory size: 507.9 KiB
$1,500: 157
$6,000: 34
$6,200: 33
$5,800: 33
$6,400: 31
Other values (2784): 7873

Length

Max length: 7
Median length: 7
Mean length: 6.717069
Min length: 6

Characters and Unicode

Total characters: 54818
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 900
Unique (%): 11.0%

Sample

1st row: $14,230
2nd row: $14,940
3rd row: $4,010
4th row: $15,440
5th row: $18,000

Common Values

ValueCountFrequency (%)
$1,500 157
 
1.9%
$6,000 34
 
0.4%
$6,200 33
 
0.4%
$5,800 33
 
0.4%
$6,400 31
 
0.4%
$5,900 30
 
0.4%
$6,100 30
 
0.4%
$6,500 29
 
0.4%
$5,400 28
 
0.3%
$5,600 26
 
0.3%
Other values (2779) 7730
94.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1,500 157
 
1.9%
6,000 34
 
0.4%
6,200 33
 
0.4%
5,800 33
 
0.4%
6,400 31
 
0.4%
5,900 30
 
0.4%
6,100 30
 
0.4%
6,500 29
 
0.4%
5,400 28
 
0.3%
5,600 26
 
0.3%
Other values (2779) 7730
94.7%

Most occurring characters

ValueCountFrequency (%)
0 11189
20.4%
$ 8161
14.9%
, 8161
14.9%
1 6037
11.0%
2 4003
 
7.3%
3 2701
 
4.9%
5 2636
 
4.8%
6 2519
 
4.6%
4 2418
 
4.4%
7 2399
 
4.4%
Other values (2) 4594
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 38496
70.2%
Currency Symbol 8161
 
14.9%
Other Punctuation 8161
 
14.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 11189
29.1%
1 6037
15.7%
2 4003
 
10.4%
3 2701
 
7.0%
5 2636
 
6.8%
6 2519
 
6.5%
4 2418
 
6.3%
7 2399
 
6.2%
8 2324
 
6.0%
9 2270
 
5.9%
Currency Symbol
ValueCountFrequency (%)
$ 8161
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8161
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 54818
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 11189
20.4%
$ 8161
14.9%
, 8161
14.9%
1 6037
11.0%
2 4003
 
7.3%
3 2701
 
4.9%
5 2636
 
4.8%
6 2519
 
4.6%
4 2418
 
4.4%
7 2399
 
4.4%
Other values (2) 4594
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54818
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 11189
20.4%
$ 8161
14.9%
, 8161
14.9%
1 6037
11.0%
2 4003
 
7.3%
3 2701
 
4.9%
5 2636
 
4.8%
6 2519
 
4.6%
4 2418
 
4.4%
7 2399
 
4.4%
Other values (2) 4594
8.4%

TIF
Real number (ℝ)

Distinct: 23
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.351305
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 4
Q3: 7
95-th percentile: 13
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.1466353
Coefficient of variation (CV): 0.77488301
Kurtosis: 0.42432793
Mean: 5.351305
Median Absolute Deviation (MAD): 3
Skewness: 0.89113956
Sum: 43672
Variance: 17.194584
Monotonicity: Not monotonic
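
Several of the descriptive statistics are tied together by simple identities (CV = standard deviation / mean, variance = standard deviation squared, mean = sum / count). The sketch below re-derives them from the figures reported for TIF; the variable names are illustrative, and the row count of 8161 is the number of observations stated in the report overview.

```python
import math

# Values copied from the TIF summary in this report.
n_rows = 8161            # observations (TIF has no missing values)
total = 43672            # Sum
mean = 5.351305          # Mean
std = 4.1466353          # Standard deviation
variance = 17.194584     # Variance
reported_cv = 0.77488301 # Coefficient of variation (CV)

cv = std / mean                 # coefficient of variation
mean_from_sum = total / n_rows  # mean recomputed from Sum
variance_from_std = std ** 2    # variance recomputed from the standard deviation

print(round(cv, 6), round(mean_from_sum, 6), round(variance_from_std, 5))
print(math.isclose(cv, reported_cv, rel_tol=1e-6))
```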
Histogram with fixed size bins (bins=23)

Value | Count | Frequency (%)
1 | 2533 | 31.0%
6 | 1341 | 16.4%
4 | 1242 | 15.2%
10 | 780 | 9.6%
7 | 620 | 7.6%
3 | 424 | 5.2%
13 | 278 | 3.4%
11 | 242 | 3.0%
9 | 225 | 2.8%
17 | 104 | 1.3%
Other values (13) | 372 | 4.6%

Value | Count | Frequency (%)
1 | 2533 | 31.0%
2 | 6 | 0.1%
3 | 424 | 5.2%
4 | 1242 | 15.2%
5 | 52 | 0.6%
6 | 1341 | 16.4%
7 | 620 | 7.6%
8 | 60 | 0.7%
9 | 225 | 2.8%
10 | 780 | 9.6%

Value | Count | Frequency (%)
25 | 2 | < 0.1%
22 | 3 | < 0.1%
21 | 11 | 0.1%
20 | 8 | 0.1%
19 | 8 | 0.1%
18 | 24 | 0.3%
17 | 104 | 1.3%
16 | 44 | 0.5%
15 | 31 | 0.4%
14 | 78 | 1.0%

CAR_TYPE
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 506.7 KiB
z_SUV: 2294
Minivan: 2145
Pickup: 1389
Sports Car: 907
Van: 750

Length

Max length: 11
Median length: 10
Mean length: 6.5647592
Min length: 3

Characters and Unicode

Total characters: 53575
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Minivan
2nd row: Minivan
3rd row: z_SUV
4th row: Minivan
5th row: z_SUV

Common Values

Value | Count | Frequency (%)
z_SUV | 2294 | 28.1%
Minivan | 2145 | 26.3%
Pickup | 1389 | 17.0%
Sports Car | 907 | 11.1%
Van | 750 | 9.2%
Panel Truck | 676 | 8.3%
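
Some category labels in this dataset carry a `z_` prefix (for example `z_SUV`). The report does not say what the prefix means; assuming it is only a labelling artifact that can be dropped safely, a minimal clean-up sketch could look like this (`strip_z_prefix` is a hypothetical helper):

```python
def strip_z_prefix(label):
    """Remove a leading "z_" marker from a category label, if present."""
    return label[2:] if label.startswith("z_") else label


# The six CAR_TYPE categories listed in the table above.
labels = ["z_SUV", "Minivan", "Pickup", "Sports Car", "Van", "Panel Truck"]
cleaned = [strip_z_prefix(label) for label in labels]
print(cleaned)
```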

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
z_suv 2294
23.5%
minivan 2145
22.0%
pickup 1389
14.3%
sports 907
 
9.3%
car 907
 
9.3%
van 750
 
7.7%
panel 676
 
6.9%
truck 676
 
6.9%

Most occurring characters

ValueCountFrequency (%)
n 5716
 
10.7%
i 5679
 
10.6%
a 4478
 
8.4%
S 3201
 
6.0%
V 3044
 
5.7%
r 2490
 
4.6%
p 2296
 
4.3%
z 2294
 
4.3%
U 2294
 
4.3%
_ 2294
 
4.3%
Other values (14) 19789
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35366
66.0%
Uppercase Letter 14332
26.8%
Connector Punctuation 2294
 
4.3%
Space Separator 1583
 
3.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 5716
16.2%
i 5679
16.1%
a 4478
12.7%
r 2490
7.0%
p 2296
6.5%
z 2294
6.5%
v 2145
 
6.1%
u 2065
 
5.8%
k 2065
 
5.8%
c 2065
 
5.8%
Other values (5) 4073
11.5%
Uppercase Letter
ValueCountFrequency (%)
S 3201
22.3%
V 3044
21.2%
U 2294
16.0%
M 2145
15.0%
P 2065
14.4%
C 907
 
6.3%
T 676
 
4.7%
Connector Punctuation
ValueCountFrequency (%)
_ 2294
100.0%
Space Separator
ValueCountFrequency (%)
1583
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49698
92.8%
Common 3877
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 5716
 
11.5%
i 5679
 
11.4%
a 4478
 
9.0%
S 3201
 
6.4%
V 3044
 
6.1%
r 2490
 
5.0%
p 2296
 
4.6%
z 2294
 
4.6%
U 2294
 
4.6%
M 2145
 
4.3%
Other values (12) 16061
32.3%
Common
ValueCountFrequency (%)
_ 2294
59.2%
1583
40.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 5716
 
10.7%
i 5679
 
10.6%
a 4478
 
8.4%
S 3201
 
6.0%
V 3044
 
5.7%
r 2490
 
4.6%
p 2296
 
4.3%
z 2294
 
4.3%
U 2294
 
4.3%
_ 2294
 
4.3%
Other values (14) 19789
36.9%

RED_CAR
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Value | Count | Frequency (%)
False | 5783 | 70.9%
True | 2378 | 29.1%

OLDCLAIM
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 2857
Distinct (%): 35.0%
Missing: 0
Missing (%): 0.0%
Memory size: 483.2 KiB
$0: 5009
$4,263: 4
$1,391: 4
$1,310: 4
$4,538: 3
Other values (2852): 3137

Length

Max length: 7
Median length: 2
Mean length: 3.6147531
Min length: 2

Characters and Unicode

Total characters: 29500
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2592
Unique (%): 31.8%

Sample

1st row: $4,461
2nd row: $0
3rd row: $38,690
4th row: $0
5th row: $19,217

Common Values

Value | Count | Frequency (%)
$0 | 5009 | 61.4%
$4,263 | 4 | < 0.1%
$1,391 | 4 | < 0.1%
$1,310 | 4 | < 0.1%
$4,538 | 3 | < 0.1%
$6,281 | 3 | < 0.1%
$5,289 | 3 | < 0.1%
$3,863 | 3 | < 0.1%
$1,552 | 3 | < 0.1%
$1,994 | 3 | < 0.1%
Other values (2847) | 3122 | 38.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 5009
61.4%
1,391 4
 
< 0.1%
1,310 4
 
< 0.1%
4,263 4
 
< 0.1%
1,332 3
 
< 0.1%
3,068 3
 
< 0.1%
5,863 3
 
< 0.1%
4,582 3
 
< 0.1%
3,338 3
 
< 0.1%
1,105 3
 
< 0.1%
Other values (2847) 3122
38.3%

Most occurring characters

ValueCountFrequency (%)
$ 8161
27.7%
0 6064
20.6%
, 3053
 
10.3%
3 1611
 
5.5%
1 1554
 
5.3%
4 1421
 
4.8%
2 1388
 
4.7%
5 1380
 
4.7%
6 1299
 
4.4%
7 1220
 
4.1%
Other values (2) 2349
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18286
62.0%
Currency Symbol 8161
27.7%
Other Punctuation 3053
 
10.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6064
33.2%
3 1611
 
8.8%
1 1554
 
8.5%
4 1421
 
7.8%
2 1388
 
7.6%
5 1380
 
7.5%
6 1299
 
7.1%
7 1220
 
6.7%
8 1199
 
6.6%
9 1150
 
6.3%
Currency Symbol
ValueCountFrequency (%)
$ 8161
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3053
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29500
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 8161
27.7%
0 6064
20.6%
, 3053
 
10.3%
3 1611
 
5.5%
1 1554
 
5.3%
4 1421
 
4.8%
2 1388
 
4.7%
5 1380
 
4.7%
6 1299
 
4.4%
7 1220
 
4.1%
Other values (2) 2349
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 8161
27.7%
0 6064
20.6%
, 3053
 
10.3%
3 1611
 
5.5%
1 1554
 
5.3%
4 1421
 
4.8%
2 1388
 
4.7%
5 1380
 
4.7%
6 1299
 
4.4%
7 1220
 
4.1%
Other values (2) 2349
 
8.0%

CLM_FREQ
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7985541
Minimum: 0
Maximum: 5
Zeros: 5009
Zeros (%): 61.4%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.1584527
Coefficient of variation (CV): 1.4506878
Kurtosis: 0.2860043
Mean: 0.7985541
Median Absolute Deviation (MAD): 0
Skewness: 1.209243
Sum: 6517
Variance: 1.3420126
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value | Count | Frequency (%)
0 | 5009 | 61.4%
2 | 1171 | 14.3%
1 | 997 | 12.2%
3 | 776 | 9.5%
4 | 190 | 2.3%
5 | 18 | 0.2%

Value | Count | Frequency (%)
0 | 5009 | 61.4%
1 | 997 | 12.2%
2 | 1171 | 14.3%
3 | 776 | 9.5%
4 | 190 | 2.3%
5 | 18 | 0.2%

Value | Count | Frequency (%)
5 | 18 | 0.2%
4 | 190 | 2.3%
3 | 776 | 9.5%
2 | 1171 | 14.3%
1 | 997 | 12.2%
0 | 5009 | 61.4%

REVOKED
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Value | Count | Frequency (%)
False | 7161 | 87.7%
True | 1000 | 12.3%

MVR_PTS
Real number (ℝ)

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.695503
Minimum: 0
Maximum: 13
Zeros: 3712
Zeros (%): 45.5%
Negative: 0
Negative (%): 0.0%
Memory size: 63.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 13
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.1471117
Coefficient of variation (CV): 1.2663568
Kurtosis: 1.3781418
Mean: 1.695503
Median Absolute Deviation (MAD): 1
Skewness: 1.3483359
Sum: 13837
Variance: 4.6100888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)

Value | Count | Frequency (%)
0 | 3712 | 45.5%
1 | 1157 | 14.2%
2 | 948 | 11.6%
3 | 758 | 9.3%
4 | 599 | 7.3%
5 | 399 | 4.9%
6 | 266 | 3.3%
7 | 167 | 2.0%
8 | 84 | 1.0%
9 | 45 | 0.6%
Other values (3) | 26 | 0.3%

Value | Count | Frequency (%)
0 | 3712 | 45.5%
1 | 1157 | 14.2%
2 | 948 | 11.6%
3 | 758 | 9.3%
4 | 599 | 7.3%
5 | 399 | 4.9%
6 | 266 | 3.3%
7 | 167 | 2.0%
8 | 84 | 1.0%
9 | 45 | 0.6%

Value | Count | Frequency (%)
13 | 2 | < 0.1%
11 | 11 | 0.1%
10 | 13 | 0.2%
9 | 45 | 0.6%
8 | 84 | 1.0%
7 | 167 | 2.0%
6 | 266 | 3.3%
5 | 399 | 4.9%
4 | 599 | 7.3%
3 | 758 | 9.3%

CAR_AGE
Real number (ℝ)

Distinct: 30
Distinct (%): 0.4%
Missing: 510
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3283231
Minimum: -3
Maximum: 28
Zeros: 3
Zeros (%): < 0.1%
Negative: 1
Negative (%): < 0.1%
Memory size: 63.9 KiB

Quantile statistics

Minimum: -3
5-th percentile: 1
Q1: 1
Median: 8
Q3: 12
95-th percentile: 18
Maximum: 28
Range: 31
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 5.7007424
Coefficient of variation (CV): 0.68450063
Kurtosis: -0.74809176
Mean: 8.3283231
Median Absolute Deviation (MAD): 5
Skewness: 0.28206372
Sum: 63720
Variance: 32.498464
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Value | Count | Frequency (%)
1 | 1934 | 23.7%
8 | 537 | 6.6%
9 | 526 | 6.4%
7 | 524 | 6.4%
10 | 469 | 5.7%
11 | 460 | 5.6%
6 | 451 | 5.5%
12 | 368 | 4.5%
13 | 356 | 4.4%
14 | 311 | 3.8%
Other values (20) | 1715 | 21.0%
(Missing) | 510 | 6.2%

Value | Count | Frequency (%)
-3 | 1 | < 0.1%
0 | 3 | < 0.1%
1 | 1934 | 23.7%
2 | 12 | 0.1%
3 | 54 | 0.7%
4 | 135 | 1.7%
5 | 305 | 3.7%
6 | 451 | 5.5%
7 | 524 | 6.4%
8 | 537 | 6.6%

Value | Count | Frequency (%)
28 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 2 | < 0.1%
25 | 6 | 0.1%
24 | 10 | 0.1%
23 | 18 | 0.2%
22 | 27 | 0.3%
21 | 51 | 0.6%
20 | 90 | 1.1%
19 | 128 | 1.6%
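
The report shows one negative CAR_AGE value (-3), which cannot be a real vehicle age. One possible way to handle it, sketched below, is to flag negative entries and treat them as missing. The helper name and the toy list are hypothetical, and whether such a value should be nulled, corrected, or dropped is a modelling decision the report does not make.

```python
def flag_negative(ages):
    """Return the positions of negative entries; None (missing) entries are skipped."""
    return [i for i, age in enumerate(ages) if age is not None and age < 0]


# Toy input (hypothetical), including one missing and one negative age.
ages = [18.0, 1.0, None, -3.0, 7.0]
bad = flag_negative(ages)

# Replace flagged entries with None so they are handled like other missing values.
cleaned = [None if i in bad else age for i, age in enumerate(ages)]
print(bad, cleaned)
```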

URBANICITY
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 609.1 KiB
Highly Urban/ Urban: 6492
z_Highly Rural/ Rural: 1669

Length

Max length: 21
Median length: 19
Mean length: 19.409019
Min length: 19

Characters and Unicode

Total characters: 158397
Distinct characters: 17
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Highly Urban/ Urban
2nd row: Highly Urban/ Urban
3rd row: Highly Urban/ Urban
4th row: Highly Urban/ Urban
5th row: Highly Urban/ Urban

Common Values

Value | Count | Frequency (%)
Highly Urban/ Urban | 6492 | 79.5%
z_Highly Rural/ Rural | 1669 | 20.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
urban 12984
53.0%
highly 6492
26.5%
rural 3338
 
13.6%
z_highly 1669
 
6.8%

Most occurring characters

ValueCountFrequency (%)
r 16322
10.3%
a 16322
10.3%
16322
10.3%
b 12984
 
8.2%
n 12984
 
8.2%
U 12984
 
8.2%
l 11499
 
7.3%
/ 8161
 
5.2%
H 8161
 
5.2%
i 8161
 
5.2%
Other values (7) 34497
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 107762
68.0%
Uppercase Letter 24483
 
15.5%
Space Separator 16322
 
10.3%
Other Punctuation 8161
 
5.2%
Connector Punctuation 1669
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 16322
15.1%
a 16322
15.1%
b 12984
12.0%
n 12984
12.0%
l 11499
10.7%
i 8161
7.6%
y 8161
7.6%
h 8161
7.6%
g 8161
7.6%
u 3338
 
3.1%
Uppercase Letter
ValueCountFrequency (%)
U 12984
53.0%
H 8161
33.3%
R 3338
 
13.6%
Space Separator
ValueCountFrequency (%)
16322
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 8161
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1669
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 132245
83.5%
Common 26152
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 16322
12.3%
a 16322
12.3%
b 12984
9.8%
n 12984
9.8%
U 12984
9.8%
l 11499
8.7%
H 8161
6.2%
i 8161
6.2%
y 8161
6.2%
h 8161
6.2%
Other values (4) 16506
12.5%
Common
ValueCountFrequency (%)
16322
62.4%
/ 8161
31.2%
_ 1669
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 158397
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 16322
10.3%
a 16322
10.3%
16322
10.3%
b 12984
 
8.2%
n 12984
 
8.2%
U 12984
 
8.2%
l 11499
 
7.3%
/ 8161
 
5.2%
H 8161
 
5.2%
i 8161
 
5.2%
Other values (7) 34497
21.8%

Interactions

[Figures: 100 pairwise interaction plots; images not included.]

Correlations

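Variable | INDEX | TARGET_AMT | AGE | HOMEKIDS | YOJ | TRAVTIME | TIF | CLM_FREQ | MVR_PTS | CAR_AGE | TARGET_FLAG | KIDSDRIV | PARENT1 | MSTATUS | SEX | EDUCATION | JOB | CAR_USE | CAR_TYPE | RED_CAR | REVOKED | URBANICITY
INDEX | 1.000 | -0.002 | 0.037 | -0.009 | 0.027 | -0.025 | -0.012 | 0.016 | 0.008 | 0.000 | 0.000 | 0.009 | 0.000 | 0.000 | 0.018 | 0.012 | 0.011 | 0.007 | 0.011 | 0.006 | 0.000 | 0.000
TARGET_AMT | -0.002 | 1.000 | -0.102 | 0.126 | -0.055 | 0.055 | -0.084 | 0.232 | 0.195 | -0.099 | 0.225 | 0.009 | 0.075 | 0.056 | 0.000 | 0.017 | 0.016 | 0.043 | 0.024 | 0.006 | 0.032 | 0.044
AGE | 0.037 | -0.102 | 1.000 | -0.516 | 0.144 | 0.007 | -0.002 | -0.034 | -0.063 | 0.179 | 0.157 | 0.165 | 0.326 | 0.095 | 0.077 | 0.123 | 0.107 | 0.061 | 0.087 | 0.076 | 0.032 | 0.062
HOMEKIDS | -0.009 | 0.126 | -0.516 | 1.000 | 0.144 | -0.013 | 0.005 | 0.046 | 0.055 | -0.160 | 0.127 | 0.322 | 0.530 | 0.039 | 0.130 | 0.105 | 0.107 | 0.000 | 0.051 | 0.081 | 0.047 | 0.061
YOJ | 0.027 | -0.055 | 0.144 | 0.144 | 1.000 | -0.005 | 0.016 | -0.018 | -0.027 | 0.039 | 0.090 | 0.079 | 0.075 | 0.252 | 0.118 | 0.063 | 0.239 | 0.062 | 0.072 | 0.072 | 0.008 | 0.104
TRAVTIME | -0.025 | 0.055 | 0.007 | -0.013 | -0.005 | 1.000 | -0.008 | 0.008 | 0.009 | -0.036 | 0.058 | 0.029 | 0.032 | 0.025 | 0.000 | 0.031 | 0.042 | 0.000 | 0.000 | 0.000 | 0.007 | 0.173
TIF | -0.012 | -0.084 | -0.002 | 0.005 | 0.016 | -0.008 | 1.000 | -0.024 | -0.042 | -0.001 | 0.081 | 0.000 | 0.025 | 0.000 | 0.006 | 0.000 | 0.012 | 0.000 | 0.007 | 0.027 | 0.016 | 0.026
CLM_FREQ | 0.016 | 0.232 | -0.034 | 0.046 | -0.018 | 0.008 | -0.024 | 1.000 | 0.414 | -0.016 | 0.241 | 0.018 | 0.057 | 0.069 | 0.000 | 0.025 | 0.026 | 0.084 | 0.035 | 0.020 | 0.072 | 0.271
MVR_PTS | 0.008 | 0.195 | -0.063 | 0.055 | -0.027 | 0.009 | -0.042 | 0.414 | 1.000 | -0.008 | 0.222 | 0.032 | 0.069 | 0.043 | 0.015 | 0.008 | 0.022 | 0.068 | 0.029 | 0.000 | 0.061 | 0.148
CAR_AGE | 0.000 | -0.099 | 0.179 | -0.160 | 0.039 | -0.036 | -0.001 | -0.016 | -0.008 | 1.000 | 0.104 | 0.022 | 0.063 | 0.028 | 0.016 | 0.422 | 0.233 | 0.092 | 0.056 | 0.020 | 0.018 | 0.177
TARGET_FLAG | 0.000 | 0.225 | 0.157 | 0.127 | 0.090 | 0.058 | 0.081 | 0.241 | 0.222 | 0.104 | 1.000 | 0.104 | 0.157 | 0.134 | 0.018 | 0.143 | 0.182 | 0.142 | 0.142 | 0.000 | 0.151 | 0.224
KIDSDRIV | 0.009 | 0.009 | 0.165 | 0.322 | 0.079 | 0.029 | 0.000 | 0.018 | 0.032 | 0.022 | 0.104 | 1.000 | 0.227 | 0.039 | 0.055 | 0.032 | 0.044 | 0.005 | 0.016 | 0.050 | 0.045 | 0.040
PARENT1 | 0.000 | 0.075 | 0.326 | 0.530 | 0.075 | 0.032 | 0.025 | 0.057 | 0.069 | 0.063 | 0.157 | 0.227 | 1.000 | 0.477 | 0.073 | 0.091 | 0.090 | 0.000 | 0.056 | 0.040 | 0.048 | 0.019
MSTATUS | 0.000 | 0.056 | 0.095 | 0.039 | 0.252 | 0.025 | 0.000 | 0.069 | 0.043 | 0.028 | 0.134 | 0.039 | 0.477 | 1.000 | 0.000 | 0.051 | 0.035 | 0.017 | 0.000 | 0.015 | 0.041 | 0.000
SEX | 0.018 | 0.000 | 0.077 | 0.130 | 0.118 | 0.000 | 0.006 | 0.000 | 0.015 | 0.016 | 0.018 | 0.055 | 0.073 | 0.000 | 1.000 | 0.043 | 0.248 | 0.279 | 0.713 | 0.666 | 0.000 | 0.052
EDUCATION | 0.012 | 0.017 | 0.123 | 0.105 | 0.063 | 0.031 | 0.000 | 0.025 | 0.008 | 0.422 | 0.143 | 0.032 | 0.091 | 0.051 | 0.043 | 1.000 | 0.562 | 0.221 | 0.094 | 0.025 | 0.017 | 0.234
JOB | 0.011 | 0.016 | 0.107 | 0.107 | 0.239 | 0.042 | 0.012 | 0.026 | 0.022 | 0.233 | 0.182 | 0.044 | 0.090 | 0.035 | 0.248 | 0.562 | 1.000 | 0.577 | 0.134 | 0.177 | 0.028 | 0.310
CAR_USE | 0.007 | 0.043 | 0.061 | 0.000 | 0.062 | 0.000 | 0.000 | 0.084 | 0.068 | 0.092 | 0.142 | 0.005 | 0.000 | 0.017 | 0.279 | 0.221 | 0.577 | 1.000 | 0.538 | 0.189 | 0.012 | 0.017
CAR_TYPE | 0.011 | 0.024 | 0.087 | 0.051 | 0.072 | 0.000 | 0.007 | 0.035 | 0.029 | 0.056 | 0.142 | 0.016 | 0.056 | 0.000 | 0.713 | 0.094 | 0.134 | 0.538 | 1.000 | 0.484 | 0.026 | 0.074
RED_CAR | 0.006 | 0.006 | 0.076 | 0.081 | 0.072 | 0.000 | 0.027 | 0.020 | 0.000 | 0.020 | 0.000 | 0.050 | 0.040 | 0.015 | 0.666 | 0.025 | 0.177 | 0.189 | 0.484 | 1.000 | 0.000 | 0.045
REVOKED | 0.000 | 0.032 | 0.032 | 0.047 | 0.008 | 0.007 | 0.016 | 0.072 | 0.061 | 0.018 | 0.151 | 0.045 | 0.048 | 0.041 | 0.000 | 0.017 | 0.028 | 0.012 | 0.026 | 0.000 | 1.000 | 0.085
URBANICITY | 0.000 | 0.044 | 0.062 | 0.061 | 0.104 | 0.173 | 0.026 | 0.271 | 0.148 | 0.177 | 0.224 | 0.040 | 0.019 | 0.000 | 0.052 | 0.234 | 0.310 | 0.017 | 0.074 | 0.045 | 0.085 | 1.000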
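
Every entry involving a categorical or boolean variable in this matrix is non-negative, which is what an association measure such as Cramér's V produces. Whether the report used exactly this statistic, or a bias-corrected variant, is an assumption here. The sketch below implements the plain, uncorrected Cramér's V in standard-library Python on hypothetical toy labels, not on the dataset itself.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy labels (hypothetical): one label fully determines the other ...
v_dependent = cramers_v(["M", "M", "F", "F"], ["Van", "Van", "SUV", "SUV"])
# ... versus labels that are spread evenly across each other.
v_independent = cramers_v(["M", "M", "F", "F"], ["Van", "SUV", "Van", "SUV"])
print(v_dependent, v_independent)
```

The statistic ranges from 0 (no association) to 1 (one variable fully determines the other), so values such as 0.713 for SEX and CAR_TYPE indicate a strong association without giving a direction.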

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
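
One common way to obtain a nullity correlation is to turn each column into a 0/1 "is missing" indicator and take the Pearson correlation between indicators. The sketch below does this in standard-library Python; the function name and toy columns are hypothetical, and it is an assumption that the heatmap in this report was computed the same way.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never (or always) missing
    return cov / math.sqrt(var_a * var_b)


# Toy columns (hypothetical values; None marks a missing cell).
yoj = [11.0, None, 10.0, None]
income = [67349, None, 16039, None]       # missing in the same rows as yoj
home_val = [None, 257252, None, 124191]   # missing exactly where yoj is present

same = nullity_correlation(yoj, income)
opposite = nullity_correlation(yoj, home_val)
print(same, opposite)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means the missingness patterns are unrelated.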

Sample

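(row) | INDEX | TARGET_FLAG | TARGET_AMT | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | SEX | EDUCATION | JOB | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CAR_AGE | URBANICITY
0 | 1 | 0 | 0.0 | 0 | 60.0 | 0 | 11.0 | $67,349 | No | $0 | z_No | M | PhD | Professional | 14 | Private | $14,230 | 11 | Minivan | yes | $4,461 | 2 | No | 3 | 18.0 | Highly Urban/ Urban
1 | 2 | 0 | 0.0 | 0 | 43.0 | 0 | 11.0 | $91,449 | No | $257,252 | z_No | M | z_High School | z_Blue Collar | 22 | Commercial | $14,940 | 1 | Minivan | yes | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban
2 | 4 | 0 | 0.0 | 0 | 35.0 | 1 | 10.0 | $16,039 | No | $124,191 | Yes | z_F | z_High School | Clerical | 5 | Private | $4,010 | 4 | z_SUV | no | $38,690 | 2 | No | 3 | 10.0 | Highly Urban/ Urban
3 | 5 | 0 | 0.0 | 0 | 51.0 | 0 | 14.0 | NaN | No | $306,251 | Yes | M | <High School | z_Blue Collar | 32 | Private | $15,440 | 7 | Minivan | yes | $0 | 0 | No | 0 | 6.0 | Highly Urban/ Urban
4 | 6 | 0 | 0.0 | 0 | 50.0 | 0 | NaN | $114,986 | No | $243,925 | Yes | z_F | PhD | Doctor | 36 | Private | $18,000 | 1 | z_SUV | no | $19,217 | 2 | Yes | 3 | 17.0 | Highly Urban/ Urban
5 | 7 | 1 | 2946.0 | 0 | 34.0 | 1 | 12.0 | $125,301 | Yes | $0 | z_No | z_F | Bachelors | z_Blue Collar | 46 | Commercial | $17,430 | 1 | Sports Car | no | $0 | 0 | No | 0 | 7.0 | Highly Urban/ Urban
6 | 8 | 0 | 0.0 | 0 | 54.0 | 0 | NaN | $18,755 | No | NaN | Yes | z_F | <High School | z_Blue Collar | 33 | Private | $8,780 | 1 | z_SUV | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban
7 | 11 | 1 | 4021.0 | 1 | 37.0 | 2 | NaN | $107,961 | No | $333,680 | Yes | M | Bachelors | z_Blue Collar | 44 | Commercial | $16,970 | 1 | Van | yes | $2,374 | 1 | Yes | 10 | 7.0 | Highly Urban/ Urban
8 | 12 | 1 | 2501.0 | 0 | 34.0 | 0 | 10.0 | $62,978 | No | $0 | z_No | z_F | Bachelors | Clerical | 34 | Private | $11,200 | 1 | z_SUV | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban
9 | 13 | 0 | 0.0 | 0 | 50.0 | 0 | 7.0 | $106,952 | No | $0 | z_No | M | Bachelors | Professional | 48 | Commercial | $18,510 | 7 | Van | no | $0 | 0 | No | 1 | 17.0 | z_Highly Rural/ Rural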
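(row) | INDEX | TARGET_FLAG | TARGET_AMT | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | SEX | EDUCATION | JOB | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CAR_AGE | URBANICITY
8151 | 10291 | 0 | 0.0 | 0 | 54.0 | 0 | 13.0 | $81,818 | No | $272,725 | Yes | M | Bachelors | Manager | 18 | Commercial | $19,660 | 1 | Van | no | $24,690 | 1 | Yes | 6 | 4.0 | Highly Urban/ Urban
8152 | 10292 | 0 | 0.0 | 1 | 46.0 | 0 | 12.0 | $45,018 | No | $0 | z_No | M | z_High School | z_Blue Collar | 26 | Private | $15,060 | 4 | Minivan | no | $33,026 | 3 | No | 0 | 1.0 | z_Highly Rural/ Rural
8153 | 10293 | 0 | 0.0 | 0 | 48.0 | 0 | 10.0 | $111,305 | No | $0 | z_No | z_F | PhD | Doctor | 59 | Private | $17,430 | 13 | z_SUV | no | $0 | 0 | No | 4 | 18.0 | Highly Urban/ Urban
8154 | 10295 | 0 | 0.0 | 1 | 38.0 | 4 | 16.0 | $12,717 | No | $0 | Yes | z_F | Bachelors | Student | 15 | Commercial | $24,740 | 1 | Pickup | no | $9,245 | 3 | No | 3 | 15.0 | Highly Urban/ Urban
8155 | 10296 | 0 | 0.0 | 0 | 41.0 | 0 | 7.0 | $6,256 | No | $0 | z_No | M | z_High School | Student | 41 | Private | $5,600 | 1 | Pickup | no | $0 | 0 | No | 0 | 7.0 | z_Highly Rural/ Rural
8156 | 10297 | 0 | 0.0 | 0 | 35.0 | 0 | 11.0 | $43,112 | No | $0 | z_No | M | z_High School | z_Blue Collar | 51 | Commercial | $27,330 | 10 | Panel Truck | yes | $0 | 0 | No | 0 | 8.0 | z_Highly Rural/ Rural
8157 | 10298 | 0 | 0.0 | 1 | 45.0 | 2 | 9.0 | $164,669 | No | $386,273 | Yes | M | PhD | Manager | 21 | Private | $13,270 | 15 | Minivan | no | $0 | 0 | No | 2 | 17.0 | Highly Urban/ Urban
8158 | 10299 | 0 | 0.0 | 0 | 46.0 | 0 | 9.0 | $107,204 | No | $332,591 | Yes | M | Masters | NaN | 36 | Commercial | $24,490 | 6 | Panel Truck | no | $0 | 0 | No | 0 | 1.0 | Highly Urban/ Urban
8159 | 10301 | 0 | 0.0 | 0 | 50.0 | 0 | 7.0 | $43,445 | No | $149,248 | Yes | z_F | Bachelors | Home Maker | 36 | Private | $22,550 | 6 | Minivan | no | $0 | 0 | No | 0 | 11.0 | Highly Urban/ Urban
8160 | 10302 | 0 | 0.0 | 0 | 52.0 | 0 | 11.0 | $53,235 | No | $197,017 | Yes | z_F | z_High School | Clerical | 64 | Private | $19,400 | 6 | Minivan | no | $0 | 0 | No | 0 | 9.0 | z_Highly Rural/ Rural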